Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unpin numpy #591

Merged
merged 6 commits into from
Nov 4, 2024
Merged

Unpin numpy #591

merged 6 commits into from
Nov 4, 2024

Conversation

hauntsaninja
Copy link
Contributor

@hauntsaninja hauntsaninja commented Jul 23, 2024

See also https://iscinumpy.dev/post/bound-version-constraints/

The workaround here is unnecessary (discussed in #43). Since numpy declares python_requires, pip's resolver will automatically discard newer numpy solutions when installing on Python 3.8.

This also saves you a chore every year when Python makes a new release :-)

See also https://iscinumpy.dev/post/bound-version-constraints/

The workaround here is unnecessary (discussed in google#43). Since numpy declares python_requires, pip's resolver will automatically discard newer numpy solutions when installing on Python 3.8
@hauntsaninja hauntsaninja requested a review from reyammer as a code owner July 23, 2024 04:00
@reyammer
Copy link
Collaborator

Thanks for the PR! Interesting... do you know if numpy started using python_requires only recently? IIRC I couldn't get anything working with the simpler setup that you are proposing... but indeed, the tests seem to pass on all 3.8 => 3.12 python versions, so it looks good to me...

reyammer
reyammer previously approved these changes Aug 20, 2024
@reyammer reyammer requested a review from invernizzi August 20, 2024 14:29
@reyammer reyammer added the python Pull requests that update Python package label Aug 20, 2024
@reyammer reyammer self-requested a review August 20, 2024 15:02
@reyammer
Copy link
Collaborator

OK, I now remember what the problem was. I was just trying to bump numpy version to 2.1.0, which only supports python>=3.10. If I update pyproject.toml with numpy = ">=1.24" as you suggested, and then I try to bump numpy version with poetry update numpy, poetry actually tries to downgrade numpy to: Downgrading numpy (1.26.4 -> 1.24.4), and then crashes. (I'm running python 3.12.)

The only way I found to fix this is to update pyproject.toml with:

numpy = [
    {version = "^1.24", python = ">=3.8,<3.9"},
    {version = "^1.26", python = ">=3.9,<3.10"},
    {version = "^2.1.0", python = ">=3.10,<3.13"},
]

Then, poetry update numpy actually brings numpy to 2.1.0. Here is my PR: #625

This seems more like a "limitation"/requirement from poetry, rather than a requirement from pip?

@gaby
Copy link
Contributor

gaby commented Aug 24, 2024

I'm pretty sure the pin is needed. Not all tools behave the same way.

@reyammer
Copy link
Collaborator

reyammer commented Sep 9, 2024

We are working to port magika project management from poetry to uv (see #663), and it seems the right way of handling our situation with numpy is still to be explicit with target versions vs. python markers such as python version. This should help uv to know what to do and it does not rely on the python_requires from numpy (I'm not sure when this info would be available to pip/poetry/uv? would the resolve need to first download the latest version of numpy to determine "this specific version is not OK given the current setup"?).

This is how it currently looks like:

"numpy>=1.24; python_version >= '3.8' and python_version < '3.9'",
"numpy>=1.26; python_version >= '3.9' and python_version < '3.10'",
"numpy>=2.1.0; python_version >= '3.10'",

The python_version constraints are now specified without the <3.13 upper bound, which means that we shouldn't need to update pyproject.toml for every new version of python.

@hauntsaninja, @gaby, thoughts?

@gaby
Copy link
Contributor

gaby commented Sep 9, 2024

@reyammer That should be fine, the main thing will be keeping track of which versions numpy is dropping. I'm surprised their lowest supported is 3.10 when 3.8/3.9 are still not EOL.

Also 👍 for using uv.

@reyammer
Copy link
Collaborator

Closing this for now; @hauntsaninja please feel free to reopen if you have more context, thanks!

@reyammer reyammer closed this Sep 20, 2024
@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Sep 20, 2024

Sorry, just seeing this!

All of pip/uv/poetry will look at Requires-Python metadata when determining the version of numpy to install. They will not let you install a version of numpy that has a Requires-Python incompatible with your Python version. I promise you that this is so! :-)

I'm not sure when this info would be available to pip/poetry/uv

The way all Python resolvers work is approximately like the following:

  • User requests magika to be installed
  • Tool now needs to figure out what version of magika to install
  • Tool will request metadata from PyPI
  • You can do this yourself! curl https://pypi.org/simple/magika/ -H 'Accept: application/vnd.pypi.simple.v1+json'
  • Tools will then pick a version of magika. This is usually the latest, but tools will filter the available versions based on the requires-python you see in that metadata. In magika's case, installers will probably look at the magika 0.5.1 wheel first
  • Tools will then request the metadata for magika to see what magika's dependencies are in the specific 0.5.1 wheel
  • You can do this yourself! curl https://files.pythonhosted.org/packages/66/79/e1c167ec35060692b70bfc4f2d0aa9314dd7e37ba8e30c1c27965e2f1daa/magika-0.5.1-py3-none-any.whl.metadata
  • This will have a Requires-Dist field that corresponds to what you have in pyproject.toml. Notably, nothing in your poetry.lock is ever exposed to users (will discuss this more later)
  • In magika 0.5.1's case, this is:
Requires-Dist: numpy (>=1.24,<2.0) ; python_version >= "3.8" and python_version < "3.9"
Requires-Dist: numpy (>=1.26,<2.0) ; python_version >= "3.9" and python_version < "3.13"
  • Tools will then recursively do this for numpy (and backtrack if some incompatible set of requirements arises)

poetry actually tries to downgrade numpy

Poetry has a concept of a lockfile. Nothing in here is ever exposed to your users, it's simply for your own development. Poetry's lockfile will try to find a single set of dependencies that you can use across different platforms and Python versions. If it's crashing, it might be because it's trying to compile numpy 1.24 for Python 3.12 from the sdist and failing, which is where your manual splitting of pins might help. uv also has a similar lockfile concept, but maybe its implementation is smart enough to avoid sdists where possible.

My recommendation:

For anything your users will see, this PR will work great, across all tools (i.e. just a simple numpy>=1.24 requirement)

For your own development / lockfiles, you could try some equivalent of:

numpy>=1.24
numpy>=1.26; python_version >= 3.12

since numpy 1.26 is the first version of numpy to ship wheels for Python 3.12

What you have on master currently also seems a little unnecessarily restrictive, e.g. your users on Python 3.10 and newer won't be able to use numpy 1.26

@reyammer reyammer reopened this Oct 7, 2024
@reyammer
Copy link
Collaborator

reyammer commented Oct 7, 2024

Thank you @hauntsaninja! Super useful. Re-opened this PR while I give some thought to it :-)

@hauntsaninja
Copy link
Contributor Author

Updated the PR with my concrete suggestion :-)

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Oct 30, 2024

I think this PR should be good and would solve a problem in a codebase I work on!

astral-sh/uv#8492 is a feature request against uv for them to do this solving automatically when making a lock file.

@reyammer
Copy link
Collaborator

reyammer commented Nov 1, 2024

Hello @hauntsaninja,

thanks so much again for the extended explanation and the PR. I've also started going through https://iscinumpy.dev/post/bound-version-constraints/, great read.

I think I've finally understood what you meant. And I think that, up to now, I've misunderstood what the python_version clause really does.

Note to self with my new understanding:

    [...]
    "numpy>=1.24; python_version < '3.12'",
    "numpy>=1.26; python_version >= '3.12' and python_version < '3.13'",
    "numpy>=2.1.0; python_version >= '3.13'",
    [...]

This means: if you are running python < 3.12, you can use any version of numpy>=1.24 that it's compatible with your python version (and it's up to the numpy package to specify which ones those are); if you are running python 3.12, then you can use any version of numpy>=1.26, as long as it supports python 3.12 (why not a lower version of numpy? Because, for example, numpy 1.25.0 does not support python 3.12); if you are running python 3.13, then you can only consider numpy>=2.1.0, because 2.1.0 is the first version that introduced support for python 3.13.

Does this make any sense? :-)

One other thing I'd like to check with you: are you saying that pip and other resolvers should be able to figure these additional constraints out by themselves? e.g., in case one runs python 3.13, then it should be clear via numpy metadata that one should could only install numpy >= 2.1.0, right? And so, you are saying that such python_version clause is only needed for dev environment to help out uv to figure out how to generate the lock file? I guess that astral-sh/uv#8492 is about that, because even for dev environments such clause should not be technically needed?

Thank you!

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Nov 1, 2024

That makes sense and is basically right!

For your users, a simple numpy>=1.24 is sufficient. magika doesn't rely on any newer numpy APIs, so if a user (and their package manager / resolver) can install any such numpy, magika will work

(and magika doesn't need to get in the business of figuring out whether a user can install such a numpy in whatever environment they're using — that's the job of the user's package manager / resolver)

For your lock file, things are different. uv wants your lock file to be homogenous, so by default uv really really wants to try to pick a single numpy version in your lock file that it thinks will work for 3.8, 3.9, 3.10, 3.11 and 3.12. Now you potentially run into a few issues...

I'll omit some detail, but basically what it means for a given numpy to support a given Python is a little bit of a grey area. For instance, it's probably possible to make numpy 1.26 work on Python 3.13. But numpy 1.26 doesn't have a Python 3.13 wheel, so the way to make this work is for your package manager to compile numpy from source. Workable, but probably a worse experience for you, especially if compilation fails for some reason.

The feature request in astral-sh/uv#8492 is for uv to basically be more willing to trade off "in lock file, only use a single version of numpy across all different versions of Python" in favour of "in lock file, use multiple versions of numpy if it means you avoid having to compile numpy from source". If implemented, we wouldn't have to manually trick uv into doing this the way this PR does. That means this PR could be simplified to being a simple numpy>=1.24 and it would work well both for your lock file and for your users.

@reyammer reyammer merged commit 317f3c1 into google:main Nov 4, 2024
14 checks passed
@reyammer
Copy link
Collaborator

reyammer commented Nov 4, 2024

All makes sense, thank you! Merged. Sorry for the delay and thanks for the extended explanations, I've learnt a lot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Pull requests that update Python package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants